What is a Space Filling Curve?
A Space-Filling Curve (SFC) is a special kind of curve that passes through every point in a grid or space, visiting them in a specific order. Think of it like a clever way to turn 2D (or higher) data into 1D, while preserving spatial closeness as much as possible.
-How It Works
The curve visits all grid cells in a recursive, structured order. The most common SFCs are:
Z-order (Morton Order): Interleaves binary bits of coordinates (easy to compute, good locality)
Hilbert Curve: More complex but keeps neighbors very close in 1D
Peano Curve: Another variant that fills in a more dense path
The grid is divided recursively, and the curve visits each small cell in a particular pattern.
-Where It's Used
Spatial indexing (especially in R-Trees, databases like PostgreSQL/SpatiaLite)
Image compression and memory layout
Big data systems (e.g., Apache HBase uses Z-ordering)
Physics simulations and cache-efficient computation
-Strengths
Preserves spatial locality: nearby points stay close in 1D order
Great for indexing and sorting spatial data
Works well for range queries and KNN when paired with other data structures
-Weaknesses
Doesn’t preserve all spatial relationships (far points may still appear close)
Harder to understand and visualize than trees
More complex to implement for high dimensions
-Example
If you have points on a 4×4 grid, a Z-order curve will visit them in a zigzag-like binary bit pattern:
(0,0) → (1,0) → (0,1) → (1,1) ...
By mapping each point to a single Z-value, you can sort or search more easily.
Complexities
Operation |
Average Case |
Worst Case |
Bulk Loading | O(n log n) | O(n log n) |
Insertion | O(log n) | O(logn) |
Deletion | O(log n) | O(logn) |
Range Query | O(log n + m) | O(n) |
k-NN Query | O(k log n) | O(kn) |
Insertion:
When a new point is added, its Z-value (or Hilbert value) is computed to determine its 1D position on the curve. The point is then inserted into a sorted structure like a balanced tree or array. This lookup and insert step is done in O(log n).
Deletion:
The point to be removed is located by its Z-value and deleted from the sorted list. Since the data is stored in order, binary search allows fast access and removal, also in O(log n).
k-NN Search:
After locating the query point’s position on the curve, its neighboring entries are scanned. The k closest points (by real Euclidean distance) are selected. Using curves like Hilbert helps preserve spatial proximity better, improving efficiency. The complexity is O(k log n).
Range Query:
The multidimensional query box is transformed into corresponding 1D intervals along the curve. These intervals are scanned in the sorted structure. Each candidate point is checked to see if it truly lies within the original multidimensional range. Complexity is O(log n + m), where m is the number of results.